Configurable `max_tokens`/`max_completion_tokens` key #399
Conversation
Signed-off-by: Tyler Michael Smith <[email protected]>
Signed-off-by: Samuel Monson <[email protected]>
Force-pushed from 68e69bc to ef981fd
Pull Request Overview
This PR implements configurable request keys for output token limits in OpenAI API calls. Instead of hardcoding both `max_tokens` and `max_completion_tokens` in all requests, the system now uses the appropriate key for each endpoint type through a new environment variable configuration.
- Adds `GUIDELLM__OPENAI__MAX_OUTPUT_KEY` configuration mapping endpoint types to their respective output token keys
- Updates payload generation to use the configured key instead of setting both keys (see the sketch after this list)
- Fixes test assertions to match the new single-key approach
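A minimal sketch of the single-key approach, assuming Python and illustrative names (`EndpointType`, `MAX_OUTPUT_KEY`, and `build_payload` are hypothetical, not the exact identifiers in guidellm):

```python
from typing import Literal

EndpointType = Literal["text_completions", "chat_completions"]

# Mirrors the documented default for GUIDELLM__OPENAI__MAX_OUTPUT_KEY.
MAX_OUTPUT_KEY: dict[EndpointType, str] = {
    "text_completions": "max_tokens",
    "chat_completions": "max_completion_tokens",
}

def build_payload(endpoint_type: EndpointType, max_output_tokens: int) -> dict:
    """Set only the output-token key configured for this endpoint type,
    rather than hardcoding both max_tokens and max_completion_tokens."""
    payload: dict = {}
    payload[MAX_OUTPUT_KEY[endpoint_type]] = max_output_tokens
    return payload
```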
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/guidellm/config.py | Adds new `max_output_key` configuration with defaults for text and chat completions |
| src/guidellm/backend/openai.py | Updates payload generation to use the configurable key and adds type definitions |
| tests/unit/conftest.py | Removes duplicate token limit assertions and fixes mock response generation |
You may want to wait for Mark's review, but looks good to me.
Summary
Makes the `max_tokens` request key configurable through an environment variable per endpoint type. Defaults to `max_tokens` for the legacy `completions` endpoint and `max_completion_tokens` for `chat/completions`.
Details
Adds a `GUIDELLM__OPENAI__MAX_OUTPUT_KEY` config option, a dict mapping route name -> output tokens key. The default is `{"text_completions": "max_tokens", "chat_completions": "max_completion_tokens"}`.
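Assuming the value is read from the environment as JSON (typical for settings keyed like `GUIDELLM__OPENAI__MAX_OUTPUT_KEY`), a hypothetical override that always sends `max_tokens`, e.g. for servers that reject `max_completion_tokens`, might look like:

```python
import os

# Hypothetical override, set before guidellm loads its settings:
# use max_tokens for both endpoint types.
os.environ["GUIDELLM__OPENAI__MAX_OUTPUT_KEY"] = (
    '{"text_completions": "max_tokens", "chat_completions": "max_tokens"}'
)
```

In practice this would normally be exported in the shell before invoking guidellm.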
Test Plan
Related Issues
Use of AI
## WRITTEN BY AI ##